Add LiraSearch tool #1604

Merged
simonbray merged 26 commits into bgruening:master from simonbray:shcoeffs
Jun 26, 2025

Conversation

@simonbray
Collaborator

Completely WIP, the tool doesn't even have an official name yet.

@simonbray
Collaborator Author

@bgruening do you have an example of a tool that accesses a database? I don't think I have any experience writing such a tool.

@simonbray
Collaborator Author

@bgruening do you have an example of a tool that accesses a database? I don't think I have any experience writing such a tool.

Hello @bgruening, I was wondering if you could comment on this?

What I meant by 'database' was anything accessed by the tool (it could be e.g. a JSON file, but we would ideally like to use RocksDB: https://rocksdb.org/docs/getting-started.html). For a small file, I would put it into $__tool_directory__, but I guess this is a bad idea for something that is several GB in size.

@bgruening
Owner

The database is something like a reference that the tool needs during runtime? But is read-only?

@bgruening
Owner

In that case you can use location (.loc) files. Those are tabular files that contain a bit of metadata for your "db" and the path to the DB. An admin can then configure this file, download the big DB and put it somewhere locally (assuming the DB is a flat file and not a server).
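
To illustrate what such a .loc file carries, here is a minimal Python sketch that parses one. It assumes the three-column layout (value, name, path) and the `#` comment character that appear in the data table definition later in this PR; the helper name `parse_loc` and the example entry are hypothetical and not part of the tool.

```python
def parse_loc(text, columns=("value", "name", "path"), comment_char="#"):
    """Parse tab-separated .loc content into a list of dicts.

    Blank lines and lines starting with comment_char are skipped.
    The column names are an assumption based on the data table in this PR.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(comment_char):
            continue
        fields = line.split("\t")
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} columns, got {len(fields)}: {line!r}")
        rows.append(dict(zip(columns, fields)))
    return rows


# Hypothetical .loc content: one comment line and one database entry.
example_loc = (
    "# value\tname\tpath\n"
    "example_db\tExample database\t/data/lirasearch/example_db\n"
)
entries = parse_loc(example_loc)
```

Each entry gives the tool an identifier, a label for the select box, and the local path that the admin configured.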

@simonbray
Collaborator Author

The database is something like a reference that the tool needs during runtime? But is read-only?

Yes, exactly. Thanks a lot :)

@bgruening
Owner

Similar to this tool: https://github.com/bgruening/galaxytools/blob/master/tools/wordcloud/test-data/fonts.loc

With a loc file for fonts.

@simonbray simonbray changed the title [WIP] add shcoeffs tool [WIP] add LiraSearch tool Jun 13, 2025
@simonbray
Collaborator Author

@bgruening here is the Dockerfile, if you have any suggestions: https://github.com/simonbray/SHCoeffsDev/blob/main/Dockerfile

The image is 9 GB, but there is a lot in there 😬

@bgruening
Owner

Is the container smaller? Still WIP?

@simonbray
Collaborator Author

It is 6.5 GB now.

Still WIP :)

@simonbray
Collaborator Author

@bgruening I am a bit stuck with the loc file, which I tried to implement in the last commit.

Tests are failing with "Parameter 'database_select': requires a value, but no legal values defined". Can you see if there is something I missed?

@simonbray
Collaborator Author

Oh, I missed the tool_data_table_conf.xml.sample file! Let me try adding it

@bgruening
Owner

Caused by: java.io.FileNotFoundException: /opt/ignite/storage/node00-bfcdcb74-02d8-4121-b8f8-af2d7efdb83e/lock (Read-only file system)

@bgruening
Owner

Does that path belong to a different user? Does it have world writeable permissions?

@simonbray
Copy link
Copy Markdown
Collaborator Author

Does that path belong to a different user? Does it have world writeable permissions?

It is the mounted test-data directory (or to be precise a symlink to it).

When the tool is installed, will it be possible for the mounted data to be writeable? I remember we talked about it in Freiburg

@bgruening
Owner

Depends on the deployment, but I would always assume that most mounted volumes are read-only. Only the job dir is writable.
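
One common way to cope with a read-only mount is to copy the database directory into the writable job directory before starting a process that needs to create lock files. The Python sketch below shows that idea only; it is not necessarily what this PR ended up doing, the helper name `writable_copy` is hypothetical, and copying may be too slow for a multi-GB database.

```python
import os
import shutil
import stat
import tempfile


def writable_copy(db_dir, job_dir):
    """Copy db_dir into job_dir and make sure the copy is owner-writable."""
    target = os.path.join(job_dir, os.path.basename(os.path.normpath(db_dir)))
    shutil.copytree(db_dir, target)
    # copytree preserves permission bits, so add the owner-write bit explicitly.
    os.chmod(target, os.stat(target).st_mode | stat.S_IWUSR)
    for root, dirs, files in os.walk(target):
        for name in dirs + files:
            path = os.path.join(root, name)
            os.chmod(path, os.stat(path).st_mode | stat.S_IWUSR)
    return target


# Small demonstration with temporary directories standing in for the
# mounted storage directory and the job directory.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "storage")
    os.makedirs(source)
    open(os.path.join(source, "data.bin"), "w").close()
    job_dir = os.path.join(tmp, "job")
    os.makedirs(job_dir)

    copy_dir = writable_copy(source, job_dir)
    lock_path = os.path.join(copy_dir, "lock")
    open(lock_path, "w").close()  # a lock file can now be created
    lock_created = os.path.exists(lock_path)
```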

@bgruening
Owner

Failed to start grid: /opt/ignite/storage/node00-bfcdcb74-02d8-4121-b8f8-af2d7efdb83e/lock (Read-only file system)

@bgruening
Owner

A bunch of new errors but it looks like solid progress :)

Traceback (most recent call last):
  File "/opt/ignite/scripts/lira_search_sdf.py", line 159, in <module>
    main()
  File "/opt/ignite/scripts/lira_search_sdf.py", line 151, in main
    sdf_file.write(''.join(molecule))
                   ^^^^^^^^^^^^^^^^^
TypeError: can only join an iterable
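
For context, Python raises this TypeError when the argument to `str.join` is not iterable, for example `None`. The sketch below reproduces the message and shows one possible guard; it assumes that `molecule` can be missing for some records, which the traceback alone does not confirm, and the helper `molecule_to_text` is hypothetical.

```python
def molecule_to_text(molecule):
    """Return the text for one record, or an empty string if it is missing."""
    if molecule is None:
        return ""
    if isinstance(molecule, str):
        return molecule
    return "".join(molecule)


# Reproduce the error from the traceback with a non-iterable argument.
try:
    "".join(None)
except TypeError as exc:
    message = str(exc)
```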

@simonbray simonbray changed the title [WIP] add LiraSearch tool Add LiraSearch tool Jun 25, 2025
@simonbray
Collaborator Author

@bgruening I think this is finally working as it should be.

The idea is that the user can choose between multiple databases to search. Each one will be a separate directory, similar to the storage directory in test-data. These directories need to be mounted into the container.

@@ -0,0 +1,6 @@
<tables>
<table name="lirasearch" comment_char="#">
<columns>value, name, path</columns>
Owner

Please add a version column and filter by the version inside your tool.

This will be useful in the future: imagine you change the DB layout. Newer versions of the tool can then show only the DBs that match the DB version they support.
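
The logic behind this suggestion can be sketched in Python as below. In a Galaxy tool the filtering would normally be declared on the data table options in the tool XML rather than written by hand; the function name `compatible_entries`, the entries, and the version strings here are hypothetical and only illustrate the idea.

```python
def compatible_entries(entries, supported_version):
    """Keep only the database entries whose version matches the tool's."""
    return [entry for entry in entries if entry.get("version") == supported_version]


# Hypothetical data table rows with an added version column.
entries = [
    {"value": "db_a", "name": "Database A", "path": "/data/db_a", "version": "1"},
    {"value": "db_b", "name": "Database B", "path": "/data/db_b", "version": "2"},
]

# A tool that supports layout version "2" would only offer db_b.
selected = compatible_entries(entries, "2")
```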

Owner

@bgruening bgruening left a comment

looks good, just one small comment

@simonbray
Collaborator Author

@bgruening like so?

@simonbray simonbray merged commit f6ee398 into bgruening:master Jun 26, 2025
11 checks passed